It's more or less impossible to write
about the Warnier/Orr methodology
because, in fact, there is no such
thing. While there are Warnier/Orr
diagrams, Warnier's methodology
(i.e., logical data structure, logical
construction of systems, and logical
construction of programs), and Orr's
methodology (data-structured systems
development), there is not, strictly
speaking, a Warnier/Orr methodology.

Many software engineers confuse dia- 
grams with methodologies. Perhaps this 
is natural, since the diagrams are the 
most visible part of most methodologies; 
but it's unfortunate, for methodologies 
are much more than just a set of diagrams 
and syntax rules. 

Within the context of software
engineering, a method is a procedure or
technique for performing some significant
portion of the software life cycle. Over
the years, techniques have been developed
for requirements definition, database
design, program design, test-case
development, and so on. A methodology,
in software engineering terms, is a
collection of methods based on a common
philosophy that fit together in a
framework called the systems development
life cycle.



Methods often use a variety of tools:
diagrams, forms, and text for documenting
and communicating. Not surprisingly,
these diagrams and forms often take
on a life of their own. Diagrams, like 
words, can be used out of context, with- 
out understanding the purpose for which 
they were intended. While the results can 
be confusing, new possibilities and uses 
often arise that are quite fortuitous. 

People who develop software engi- 
neering methods and methodologies at- 
tempt to solve problems, observe what 
others do, and derive, or abstract, pat- 
terns from all this. Those patterns ulti- 
mately turn into methodologies. 

In my experience, my colleagues and I 
always know what works long before we 
know why it works. Software engineer- 
ing methodologists are skilled at work- 
ing with experts, such as analysts, pro- 
grammers, database administrators, and 
so forth, finding out how these experts 
do what they do, and putting these find- 
ings down in such a way that others can 
follow them. 

The correct name for what many people
call the Warnier/Orr methodology is
data-structured systems development
(DSSD). DSSD, like most methodologies,
is actually the result of many people's
efforts, in addition to my own, including
my coworkers at Optima, colleagues, and
clients. Much of the methodology has come
about by taking various component
technologies, such as structured
programming and relational-database
design, and putting them together into a
coherent framework.

A Little History

In 1972, Terry Baker's article "Chief 
Programmer Team Operations" in the 
IBM Systems Journal had a major impact 
on the field. It brought together several 
ideas: structured programming, top- 
down design and implementation, the 
chief programmer, the chief-programmer
team, and the documentation librarian.
If there was a shot that started the
"structured revolution" in the U.S., 
Baker's article was it. 

In the early 1970s, I became inter- 
ested in structured programming and in 
structured design. In applying the princi- 
ples of top-down design, I discovered that 
many of my best, most intelligible
solutions were those in which the
hierarchical structure of the program
mirrored the hierarchical structure of
the data the program was processing.

Shortly after this discovery, I stumbled
across the work of Jean Warnier and
realized that he not only had made the 
same discovery with regard to data- 
structured programming but had already 
built a systematic methodology around 
it. I also followed Michael Jackson's 
work, another form of data-structured 
design. 

I already believed that you could and 
should construct programs hierarchi- 
cally using only a few basic logical struc- 
tures. Moreover, I believed that if you 
were going to build very large things, you 
should build them in systematic ways 
based on simple structures. This coin- 
cided with design and construction tech- 
niques used in fields such as electrical 
engineering. Structured programming 
represented a base on which to build; 
therefore, using the data structure as the 
framework for building the program 
structure seemed like the next natural 
step. 

Data-structured programming meant
that you could create predictably correct
solutions for a wide class of programming
problems: problems in which the
structures of the input and the output
were the same or very similar. But
beyond that, Warnier, Jackson, and those
of us involved in developing DSSD were
able to extend data-structured techniques
to arbitrarily complex programs.

To solve these more complex prob- 
lems, you must recognize that the nature 
of the problem of complexity is, on one 
level at least, fundamentally mathemati- 
cal in nature— that is, complex problems 
are fundamentally n:n (many-to-many) 
mappings from input to output. To deal 
with this complexity systematically, you 
must break the problems down into a 
series of less complex mappings. 

This is what mathematicians have 
been doing for thousands of years— 
breaking large troublesome problems 
into smaller ones for which there are 
clear precise answers. In the case of 
data-structured design, this meant devel- 
oping a scheme in which the physical in- 
puts were mapped into logical inputs; the 
logical inputs were then mapped into the 
logical outputs; and, finally, the logical 
outputs were mapped into the physical 
outputs. 

With this overall program-design
framework comes a goal-oriented design
strategy: an approach that starts with
the structure of the output and works
backward, first to the logical, or ideal,
input, and then to the physical input.
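
The chain of mappings described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not DSSD
notation: the comma-separated sales lines, the function names, and the
report layout are all hypothetical. The design is read backward from the
output: the report layout fixes the logical output (totals per customer),
which fixes the logical input (clean records), which in turn tells us what
must be extracted from the physical input.

```python
from collections import defaultdict


def physical_to_logical_input(lines):
    """Physical input (raw text lines) -> logical input (clean records)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are a purely physical concern
        customer, amount = line.split(",")
        records.append({"customer": customer.strip(), "amount": float(amount)})
    return records


def logical_input_to_logical_output(records):
    """Logical input -> logical output (total amount per customer)."""
    totals = defaultdict(float)
    for record in records:
        totals[record["customer"]] += record["amount"]
    return dict(sorted(totals.items()))


def logical_to_physical_output(totals):
    """Logical output -> physical output (fixed-width report lines)."""
    return [f"{customer:<10}{total:>10.2f}" for customer, total in totals.items()]


# Hypothetical physical input: one "customer, amount" pair per line.
raw = ["acme, 10.50", "", "zenith, 4.00", "acme, 2.25"]
logical_output = logical_input_to_logical_output(physical_to_logical_input(raw))
report = logical_to_physical_output(logical_output)
```

Each function is a simple mapping on its own; only their composition
handles the full physical-input-to-physical-output problem.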

The data-structured approach to pro- 
gram design has proven to be successful 
on a wide variety of problems, but it is 
clearly no panacea. What it does repre- 
sent is a systematic approach to attacking 
complex problems (simple problems have 
a way of taking care of themselves, or,
alternatively, becoming complex). 

Programming in the Large 
At some point in developing techniques
for building systems, you realize that the
most significant problems in software
occur not at the programming level but at
the systems level. How do you design
entire suites of programs so that they work
effectively together? How do you get the
right requirements? Where does planning
fit into the scheme of things?

Little by little, DSSD moved from a
program-design methodology to a
systems-design methodology. Over a period
of years, the methodology was expanded
to deal with database design, requirements
definition, and finally systems
planning and architecture.

Figure 1: At the systems level, instead of working backward to the ideal inputs, DSSD works backward to the
logical database and then to the inputs.

At a conceptual level, DSSD still re- 
tains features that characterized it at the 
programming level. For example, it still 
focuses (in its design phase) on working 
backward from outputs. But at the systems
level, instead of working backward
to the ideal inputs, as it does in the
programming methodology, DSSD works
backward to the logical database and
then to the inputs (see figure 1). The
logical database turns out, not surprisingly,
to be a normalized relational database.

While a complete definition of the
results (outputs plus algorithms) is an
excellent point at which to begin the design
process, it is not the proper place to start
requirements definition. So, over the
years, DSSD has been extended to cover
first the context, then the functions, and
finally the results of the system in
question.

Thus, a number of tools were needed 
to facilitate this process. Entity diagrams 
help you define the systems context, and 
assembly-line diagrams (a modified form
of Warnier/Orr diagrams) help you de- 
fine the functional flow of the system. 



Data-structured methodologies have, I 
believe, a leg up on more process-ori- 
ented methodologies, since they are 
more rigorous and hence provide a better 
basis for true integration throughout the 
systems life cycle. DSSD has been used 
successfully on a range of software 
systems, from commercial on-line sys- 
tems to real-time control systems. Thou- 
sands of people have been trained and 
thousands of systems have been built 
using it. 

DSSD is a software engineering ap- 
proach that has provided a stable frame- 
work for incorporating new technologies 
as they come along. For example, we 
have incorporated prototyping, on-line,
and real-time design into DSSD without
sacrificing its rigor or completeness.
But there is a catch: To use DSSD
successfully, you must invest time in
training, use, and automation. In software
engineering, as in life, there is no free
lunch.

The Gane/Sarson Approach

Chris Gane 



When we think about an infor- 
mation system that doesn't 
exist yet, our ideas are usu- 
ally pretty vague and gen- 
eral. This is not an accusation; it's a fact 
of human psychology. 

The purpose of logical modeling is to 
take these necessarily vague ideas about 
requirements and convert them into pre- 
cise definitions as fast as possible. Part 
of the speed comes from having graphical
techniques that enable you to put
down the essence of a system without
going through the trouble of actually
physically implementing it, as you might do,
for example, in a prototype. 

Several approaches to logical model- 
ing have been proposed. The one out- 
lined here is the current version of the ap- 
proach set out in a book I wrote with 
Trish Sarson (see reference 1). It has
become generally known as the
Gane/Sarson methodology.

Logical Modeling 

You can think of logical modeling as a
seven-step process. Suppose the users
say, "We need a system that integrates
sales, inventory control, and purchasing."
What exactly does that mean?

• Step 1. Develop a system-wide data-flow
diagram (DFD) describing the underlying
nature of what occurs in the
sales, inventory control, and purchasing
areas of the business. The simplicity of
the DFD comes from the use of only four
symbols to produce a picture of the
underlying logical nature of any information
system, at any desired level of detail.

Figure 1 shows CUSTOMERS (an ex- 
ternal entity, something outside the sys- 
tem) sending in a stream of sales orders 
along the data-flow arrow. Process 1, 
process sales, handles those orders using 
product information from the data store 
called D1: PRODUCTS and puts information
about sales into the data store
named D3: SALES. 

This figure also shows the whole of
the business area, depicted using only the
four symbols. For each sale, process 1
updates the INVENTORY data store,
D2, with the units sold. The data stored
in D3 is used by processes 2 and 3 to
prepare bank deposit documents and send
them to the bank, and to prepare sales
reports and send them to management.

At some appropriate time — notice that 
time is not shown on the DFD— process 4 
extracts information about the inventory 
status of various products from D2 and 
combines it with information from D3 
concerning their past sales, to determine 
whether a product needs to be reordered. 
If so, based on information in D4, which 
describes the prices and delivery times 
quoted by suppliers, process 4 chooses 
the best supplier to order from. 

Process 4 sends purchase orders to the 
external entity SUPPLIERS and stores 
information about each purchase order in 
D5: POS_IN_PROGRESS. When a 
shipment is received from a supplier, 
process 5 analyzes it, extracting data 
from POS_IN_PROGRESS to determine
whether what has been received is what
was ordered, incrementing the INVENTORY,
D2, with the accepted amount,
and storing the accepted quantities in
POS_IN_PROGRESS.

This DFD achieves three things. First, 
it sets a boundary to the area of the sys- 
tem and of the business covered by the 
system. Things represented by the exter- 
nal-entity symbol (i.e., customers, the 
bank, management, and suppliers) are, 
by definition, outside the system.
Processes not shown are not part of the
project. For example, the diagram shows
receiving shipments from suppliers but not
handling the invoices received from 
them, implying that accounts payable is 
outside the scope of the project as well. 

Second, it is nontechnical. Nothing is 
shown on a DFD that is not easily under- 
standable to people familiar with the 
business area depicted, whether or not 
they know anything about computers. 

Third, it shows both the data stored in
the system and the processes that
transform that data. It shows the relationships
between the data and the processes in the
system.

• Step 2. Derive a first-cut data model,
that is, a list of the data elements to be
stored in each data store, as defined on
the DFD. You should draw up this list
from your own knowledge and from the
knowledge of users about what information
you need to describe a product, a
supplier, a sale, and so on.

You can refine the list by looking at 
each system input, such as sales orders or 
shipments in figure 1, determining what
data elements each input represents, 
looking at each output in the same way, 
and then working from the outputs back 
to the data stores or from the inputs for- 
ward to the data stores. 

• Step 3. See what entity-relationship
analysis can tell you about the structure
of the data to be stored in the system.
First, you ask, "What are the entities of
interest about which I may need to store
data?" For this business, the answer
might be CUSTOMERS, PRODUCTS,
INVENTORY, SUPPLIERS, SALES,
and PURCHASE_ORDERS. Then, you
create a diagram with a block for each
entity you have identified. (It is
conventional in this diagram to state the entities
as singular nouns, for example,
CUSTOMER instead of CUSTOMERS.)

Next, looking at each pair of entities
on the diagram, you ask, "What, if any, 
relationships exist between them?" For 
example, you know that one customer 
may be associated with many sales, but 
each sale can be for only one customer. 
This is conventionally shown by a line 
with an arrowhead against the "many" 
block and a plain line at the "one" block. 

Take, for instance, PRODUCT and
SALE: One product may be associated
with many sales, and one sale may be for
many products (at least one, and possibly
more). This relationship is shown by a
line with an arrowhead on both ends. On
the other hand, each product has only
one inventory record, and each inventory
record refers to only one product.
Consequently, they are joined by a simple line.
Adding in all the identifiable relationships
creates a diagram like figure 2.

Figure 1: A DFD for the whole of the business area. Note the box for external entities, the open rectangles for data stores, the
rounded box for the process, and the data-flow arrow, which shows the direction of data movement. Notice also that time is not
shown on a DFD.

• Step 4. Use all the information you
have about the data so far to describe the 
data model as one made of linked, two- 
dimensional tables. These tables should 
be normalized (i.e., made as simple as 
possible). One way to summarize the 
rules of normalization is to say that in a 
properly simplified table, in which a col- 
umn or combination of columns uniquely 
identifies each row (the key), each non- 
key column should depend only on the 
key. 

• Step 5. Redraft the DFD to reflect a 
more precise view of the system data as a 
result of entity-relationship analysis and 
normalization. 

• Step 6. Partition this logical model of 
process and data into procedure units,
that is, chunks of automated and manual 
procedures that can be executed (and 
therefore developed) as units. To do this, 
you consider each input and output and 
ask the following questions for each one: 

1. When does it happen?

2. How large an area of the DFD is
involved in handling or producing
it?

3. Can that area be implemented as a
single unit? If not, why not?

Figure 2: All the identifiable relationships between entities. For each pair of entities
on the diagram that has a relationship between its elements, the relationship may be
one-to-one, one-to-many, many-to-one, or many-to-many. This diagram provides a
lot of information about the system in showing all the relationships that exist between
the entities involved.

• Step 7. Specify the details of each
procedure unit that will be required to
implement the system. A procedure-unit
specification may involve

1. an extract from the system DFD
showing where this procedure unit
fits into the rest of the system;

2. details of the tables accessed by
the procedure unit;

3. layouts for any screens and reports
involved in the procedure unit; and

4. details of the logic and procedures
to be implemented, written in
structured English or some other
unambiguous form.

With the nature of the procedure unit
defined, you can decide whether it should
be prototyped or implemented directly in 
the target language. You can develop the 
screen and report layouts by prototyping. 

Steps 6 and 7 in this sequence are not,
strictly speaking, logical modeling,
since they deal with converting the
logical model into a physical model. They
are included, however, because they
form part of the natural flow of thought
processes beginning with defining the
system and ending in its physical design.
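
The normalization rule summarized in Step 4 (each nonkey column should
depend only on the key) can be probed against sample data. The sketch below
is a hedged illustration, not part of the Gane/Sarson method: the function
names and the SALES sample rows are hypothetical, and a data-driven check
can only suggest dependencies that hold in the sample, never prove them for
the business. It also looks only at dependencies between single nonkey
columns, not at dependencies on part of a composite key.

```python
from itertools import permutations


def determines(rows, determinant, column):
    """True if, in these rows, each value of `determinant` maps to one value of `column`."""
    seen = {}
    for row in rows:
        value = tuple(row[c] for c in determinant)
        if seen.setdefault(value, row[column]) != row[column]:
            return False
    return True


def key_is_unique(rows, key):
    """True if the key columns identify each row uniquely."""
    keys = [tuple(row[c] for c in key) for row in rows]
    return len(keys) == len(set(keys))


def non_key_dependencies(rows, key):
    """Pairs (d, c) of nonkey columns where c appears to depend on d, not just on the key."""
    non_key = [c for c in rows[0] if c not in key]
    return [(d, c) for d, c in permutations(non_key, 2) if determines(rows, (d,), c)]


# Hypothetical first-cut SALES table that mixes sale data with product data.
sales = [
    {"sale_id": 1, "product_id": "P1", "product_name": "Widget", "quantity": 2},
    {"sale_id": 2, "product_id": "P2", "product_name": "Gadget", "quantity": 1},
    {"sale_id": 3, "product_id": "P1", "product_name": "Widget", "quantity": 5},
    {"sale_id": 4, "product_id": "P2", "product_name": "Gadget", "quantity": 2},
    {"sale_id": 5, "product_id": "P3", "product_name": "Widget", "quantity": 1},
]

unique = key_is_unique(sales, ("sale_id",))
suspects = non_key_dependencies(sales, ("sale_id",))
```

Here `product_name` appears to depend on `product_id` rather than on the
sale, which suggests moving it to a separate PRODUCT table keyed by
`product_id`.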

Editor's note: Chris Gane extracted this
article from Chapter I of his book Rapid 
System Development, published by Pren- 
tice-Hall in December 1988. 

The Yourdon Approach

Edward Yourdon 



The Yourdon method is a generic, 
ecumenical collection of soft- 
ware engineering ideas devel- 
oped over the past 20 years by a 
variety of people who have worked at 
Yourdon, Inc. Taken together, these 
ideas are often referred to as structured 
techniques: structured programming, 
structured design, and structured 
analysis.

Because of the continuing influx of
new ideas from new people, the Yourdon
method is constantly evolving. The
method that thousands read about in Tom
DeMarco's book in 1978 (see reference
1) has changed considerably in the past
10 years. And the Yourdon method of
1989 is evolving to incorporate the best
ideas of object-oriented design and
analysis.

But what is the "Yourdon method" today?
It consists of two things: tools and
techniques. The tools are a variety of
graphical diagrams used to model the
requirements and the architecture of an
information system. The most familiar of
these tools is the data-flow diagram
(DFD) (see figure 1). The original DFD
notation was extended a few years ago to
support real-time systems; a real-time
DFD includes control flows and control
processes. For a detailed description of
real-time DFDs, see reference 2.

APRIL 1989 • BYTE 227

IN DEPTH
METHODOLOGY

Figure 1: A data-flow diagram. The DFD models the functions that a system must
perform. (Labels legible in the figure: Order details, ORDERS, Shipping details,
RECEIVE ORDER, Customer name, customer address, WAREHOUSE, Orders.)

While the DFD is an excellent tool for 
modeling the functions that a system 
must carry out, it says little or nothing 
about data relationships and time-depen- 
dent behavior. Thus, the current Yourdon 
method also includes entity-relationship 
diagrams (ERDs) and state-transition di- 
agrams (STDs) (see reference 3). 

After you have finished describing the 
system requirements, you can use a 
structure chart to illustrate the organiza- 
tion of modules that will implement those 
requirements. A number of guidelines 
exist that the systems analyst can follow 
to ensure that each diagram is complete 
and logically consistent. 

While the graphical diagrams provide 
an effective way of communicating information
about different aspects of a system,
they don't tell the whole story. For a
complete system description, you need
additional textual support: a data dictionary,
which describes the composition of
each data element, and a set of process
specifications that describe the required 
behavior of each bottom-level "bubble" 
in the DFD. 

The Techniques

The techniques of the Yourdon method 
consist of some "cookbook" guidelines 
that help you go from a blank sheet of 
paper, or a blank screen, to a well-orga- 
nized system model. Originally, these 
guidelines were based on the simple concept
of top-down partitioning of system
functions (e.g., draw a single bubble or
box to represent the entire system, then
draw lower-level bubbles or boxes to
represent subsystems, and so forth).

Today, the Yourdon method uses a 
technique known as event partitioning 
(see reference 4). This approach begins 
by drawing a top-level context diagram to 
identify the system boundary and to de- 
fine the interfaces between the system 
and external sources and sinks. Then, 
after interviewing the user, you can write 
a list of the events that occur in the exter- 
nal environment and to which the system 
must respond. (Events are often input 
transactions.) 

The event-partitioning approach pro- 
vides a simple guideline to help you com- 
pose a first-cut crude DFD: For each 
event, draw one bubble whose function is 
to provide the required response to the 
event. (In most cases, the response in- 
volves generating an output, but it may 
also involve storing some information in 
a data store to be used by some subse- 
quent event.) 

For a system with 100 events, the DFD 
would have 100 bubbles. This is too com- 
plex to work with, so the event-partition- 
ing technique provides guidelines to help 
you partition upward— that is, to gather 
several of the DFD bubbles together and 
represent them by a single bubble in a 
higher-level DFD. The strategy for deciding
which bubbles should be grouped
together is to look for bubbles that deal
with common data (e.g., a common data
store). In this sense, event partitioning is
very similar to the object-oriented design
approach.
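
As a rough illustration of the two guidelines above (one bubble per event,
then grouping upward by common data), here is a minimal Python sketch. The
event list, the data-store names, and both function names are hypothetical,
and grouping by any shared data store is only one simple reading of "common
data"; the actual event-partitioning guidelines are richer than this.

```python
def first_cut_bubbles(events):
    """One bubble per event; each bubble records the data stores it touches."""
    return [
        {"bubble": f"respond to: {event}", "stores": set(stores)}
        for event, stores in events.items()
    ]


def partition_upward(bubbles):
    """Group bubbles that share a data store into higher-level bubbles."""
    groups = []
    for bubble in bubbles:
        merged = {"stores": set(bubble["stores"]), "bubbles": [bubble["bubble"]]}
        remaining = []
        for group in groups:
            if group["stores"] & merged["stores"]:
                # Shared data: fold the existing group into the new one.
                merged["stores"] |= group["stores"]
                merged["bubbles"] = group["bubbles"] + merged["bubbles"]
            else:
                remaining.append(group)
        remaining.append(merged)
        groups = remaining
    return groups


# Hypothetical events and the data stores their responses read or write.
events = {
    "customer places order": {"ORDERS"},
    "customer cancels order": {"ORDERS"},
    "shipment arrives": {"INVENTORY"},
    "stock count taken": {"INVENTORY"},
    "supplier sends price list": {"SUPPLIER_PRICES"},
}

bubbles = first_cut_bubbles(events)
higher_level = partition_upward(bubbles)
```

Five first-cut bubbles collapse into three higher-level bubbles, one per
cluster of shared data.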

There are various additional guidelines
and techniques to help you compose
well-formed models of both system re- 
quirements and system architecture. 
(One book that discusses both the analy- 
sis area and the design area—as well as 
the "twilight zone" that separates the 
two— is given in reference 5.) 

The Yourdon Philosophy 
Throughout all of the Yourdon method
(regardless of variant or dialect, whether
you draw circles or ovals in your DFD, or
where you hear about it), you will see the
following philosophies.

• Modeling is good. Developing a model
of a system before you build it is almost 
always a useful, educational activity. For 
this to work, however, the model has to 
be inexpensive and easy to build: If it 
costs as much to develop the model as to 
develop the system, it's obviously a waste
of time. The model also has to be accu- 
rate—it should not mislead you or lie to 
you. And it should be easy to understand: 
It should highlight those aspects of the 
system that are important, and it should 
deemphasize or hide those aspects that 
are unimportant or uninteresting. 

Since most systems are complex in 
three different dimensions (functions,
data, and timing and control), it is useful
to have three different types of models:
DFDs, ERDs, and STDs, each of which
illustrates a single perspective of the
system. The Yourdon method is based on
abstract, pictorial models, either on paper
or on a computer screen.

Another approach is to develop a pro- 
totype of the system as a model— a living, 
breathing model instead of a passive col- 
lection of diagrams. When prototyping 
was first introduced in the early 1980s, it 
was considered an alternative to paper- 
based modeling approaches— the sys- 
tems analyst was often told to make a 
binary choice between prototyping and 
drawing DFDs. 

Today, we know that the two ap- 
proaches are complementary: You can 
draw diagrams as a permanent record of 
system requirements and use prototyping 
to experiment with such key issues as the 
user interface (input screens, report lay- 
outs, and so on) . For a good discussion of 
the marriage of prototyping and "classi- 
cal" structured analysis, see reference 6. 

• Iteration is good. As fallible humans
with limited intelligence, we rarely, if
ever, develop a perfect solution to a
complex problem on the first attempt. At
best, we can hope for a crude beginning
that, through iteration, we can gradually
refine and improve. To practice iteration,
we must have models that are easy
to create and easy to revise. In the past,
we grappled with the finality of pen and
ink; with the word processors of today,
most of us take iteration for granted in
composing reports and memos.

In systems development, we tried to
make iteration of system models easier
by insisting on partitioning the overall
system model into a number of separable
submodels. Thus, if one aspect of a system
changes, ideally only one page of a
diagram has to be modified. As a practical
matter, though, most systems analysts
in the 1970s and early 1980s drew
DFDs only once, on paper. This is one
reason why today's microcomputer-based
computer-aided software engineering
products are so important: They
make iteration a practical reality.

• Partitioning is good. When we first
learned how to write computer programs,
we were given simple problems
that we could finish in a day or two,
keeping every aspect of the problem in
our heads at once. With real-world
programming problems, however, the only
way we can successfully build systems
that, today, typically involve more than a
million lines of code is by partitioning
the system into smaller and smaller
pieces.

There are great debates about whether
the partitioning should be based on
functional decomposition or data
decomposition. But either approach, followed
rigorously, is better than no partitioning, or
sloppy partitioning that leads to subsystems
with subtle, pathological
interconnections.

The Entity-Relationship Approach



Peter P. Chen 




One of the major problems in
software engineering today is 
the piecemeal approach to sys- 
tems design. This approach 
makes the integration of different appli- 
cation systems difficult, if not impossi- 
ble. We try to design the data structures 
and formats to fit current processing 
needs and then later run into problems of 
data conversion and integration. 

An integrated database is a solution to 
these problems. However, acquiring a 
DBMS does not make them go away. 
What is needed is a structured method- 
ology that can systematically convert 
user requirements into well-designed 
databases. The entity-relationship (ER)
approach is such a methodology. 

Let's start with an example. Say you 
need a program to keep track of the list of 
employees working for each department 
in your company. This program needs to 
accept data on the screen, store it on disk, 
and print out the report on demand. The 
programmer/analyst comes up with a file 
format (see figure 1a).



In the meantime, another group in the 
company implements a program to keep 
track of employee information for each 
project; the file format in this program 
turns out a little different from the other 
(see figure 1b). Each program satisfies
the needs of the group that requested it. 
However, one day the company president 
wants to know which departments have 
employees working on project X. Then 
everyone scrambles around trying to con- 
vert the data in one file to the format of 
the other file. Let's look deeply into 
these two file formats to see what kinds 
of problems they had. 

• Synonym (the same data element
has different names): For example,
SOC_SEC_NO in figure 1a is the
same data element as SS# in figure
1b.

• The same name for different data
elements: For example, NAME in
figure 1a refers to the name of an
employee, while NAME in figure
1b is the name of a project.

• Incompatibility of data formats:
For example, the data-type format
of AGE in figure 1a is Int(2), while
the data format of AGE in figure
1b is Real(3.2).

• Duplication of data: For example,
the project data (PROJ#, NAME)
is duplicated for each employee
associated with the project in
figure 1b, and the BUDGET data of
each department in figure 1a is
repeated for each employee.

• Update anomalies: For example,
changing any of an employee's
data-element values in one file but not
in the other will result in
inconsistent data.
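
Some of the problems listed above can be surfaced mechanically once both
layouts are written down. The following Python sketch is an illustration
only: figures 1a and 1b are not reproduced here, so the two layouts are
hypothetical reconstructions. Only the field names SOC_SEC_NO, SS#, NAME,
PROJ#, and BUDGET and the two AGE formats come from the text; DEPT# and
every other type are assumptions.

```python
# Hypothetical record layouts (field name -> data format).
FILE_1A = {
    "DEPT#": "Char(4)",       # assumed field and format
    "BUDGET": "Real(9.2)",    # assumed format
    "SOC_SEC_NO": "Char(9)",  # assumed format
    "NAME": "Char(30)",       # employee name; assumed format
    "AGE": "Int(2)",          # format given in the text
}
FILE_1B = {
    "PROJ#": "Char(4)",       # assumed format
    "NAME": "Char(30)",       # project name; assumed format
    "SS#": "Char(9)",         # assumed format
    "AGE": "Real(3.2)",       # format given in the text
}

# Synonyms cannot be discovered from names alone; someone who knows
# both files has to supply them.
SYNONYMS = {"SOC_SEC_NO": "SS#"}


def shared_names(a, b):
    """Names used in both files: candidates for 'same name, different element'."""
    return sorted(set(a) & set(b))


def format_conflicts(a, b):
    """Shared names whose data formats differ between the two files."""
    return {n: (a[n], b[n]) for n in shared_names(a, b) if a[n] != b[n]}


def synonym_pairs(a, b, synonyms):
    """Declared synonyms that actually occur in the two layouts."""
    return [(x, y) for x, y in synonyms.items() if x in a and y in b]


names_to_review = shared_names(FILE_1A, FILE_1B)
conflicts = format_conflicts(FILE_1A, FILE_1B)
synonyms_found = synonym_pairs(FILE_1A, FILE_1B, SYNONYMS)
```

Note that NAME shows up only as a name to review: a format comparison
cannot tell that it means an employee in one file and a project in the
other, which is why such checks do not replace a proper data model.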

If the above file designs are not good, 
what would be a good design? How many 
record types (or relations in the case of 
relational databases) should there be? 
Should there be one huge record type
consisting of all data elements, or the
other extreme: many small records,
each consisting of a pair of data
elements? Furthermore, what is the primary
key for each record (relation) type?
The main question is: Do we have a
methodology for file and database
design? The answer is yes, and the leading
methodology is the ER approach.
Six years ago, a survey of Fortune 500
companies (published in ACM SIGMOD
proceedings, 1983) conducted by two
Ohio professors showed that the ER
methodology ranked as the most popular
methodology in data modeling and database
design. Why? Because it is simple,
easy to understand by noncomputer peo- 
ple, and theoretically sound. To illus- 
trate, here are the major steps of the ER 
approach using the above example: 

• Develop an entity-relationship dia- 
gram (ERD). This step identifies ER 
types and associated attributes and also
the primary keys for each entity type. 

An entity is a thing (e.g., a person or 
an automobile), a concept, an organization,
or an event of interest to the organization
doing the modeling. An entity
type is a classification of entities satisfying
certain criteria. A relationship is an
interaction between entities. A relation- 
ship type is a classification of relation- 
ships based on certain criteria. Usually, 
nouns in English correspond to entities, 
while verbs correspond to relationships. 

In the example in figures 1a and 1b,
you can identify three entity types: 
DEPT, EMP, and PROJ. You can also 
identify two relationship types: HAS and 
WORK_FOR (note that relationship-type 
names are verbs). Figure 2 depicts an 
ERD in which rectangular boxes repre- 
sent entity types and diamond-shaped 
boxes represent relationship types. 

The next step is to identify the
cardinality of the relationship types. The
cardinality of HAS (between DEPT and
EMP) is 1:n (one-to-many); that means a
department can have many employees,
but each employee belongs to at most one
department. The cardinality of WORK_FOR
(between EMP and PROJ) is n:n
(many-to-many). You then identify the
properties (attributes) of each ER type and express
them graphically as circles (or ellipses).
For example, each DEPT has attributes
DEPT# and BUDGET. The primary key
is indicated by a double circle. Note that
there is an attribute called %TIME for
relationship WORK_FOR.
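
As a rough illustration of what this first step records, the sketch below
captures the ERD of figure 2 as plain Python data. The class and field names
are hypothetical, chosen only for this example. The key attributes SS# and
PROJ# and the attributes NAME, AGE, and PNAME are read from the relational
structure given in the next step, because the diagram itself is not
reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    name: str
    key: str                                         # primary key (double circle in the ERD)
    attributes: list = field(default_factory=list)   # non-key attributes (single circles)


@dataclass
class RelationshipType:
    name: str
    left: str                                        # first participating entity type
    right: str                                       # second participating entity type
    cardinality: str                                 # "1:n" or "n:n"
    attributes: list = field(default_factory=list)   # attributes of the relationship itself


# Entity types (rectangles in the ERD).
entities = {
    "DEPT": EntityType("DEPT", key="DEPT#", attributes=["BUDGET"]),
    "EMP": EntityType("EMP", key="SS#", attributes=["NAME", "AGE"]),
    "PROJ": EntityType("PROJ", key="PROJ#", attributes=["PNAME"]),
}

# Relationship types (diamonds in the ERD).
relationships = [
    RelationshipType("HAS", "DEPT", "EMP", "1:n"),
    RelationshipType("WORK_FOR", "EMP", "PROJ", "n:n", attributes=["%TIME"]),
]

for r in relationships:
    print(f"{r.left} --{r.name} ({r.cardinality})--> {r.right}")
```

Keeping the cardinality on the relationship type, rather than on either
entity type, mirrors the diagram: it is a property of the diamond, not of
the boxes it connects.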

• Convert the ERD into conventional file
and database structures. There are rules
for doing this. For example, you can
convert the ERD in figure 2 into the
relational structure with all the primary keys
underlined.

DEPT(DEPT#, BUDGET)
EMP(SS#, NAME, AGE, DEPT#)
PROJECT(PROJ#, PNAME)
WORK_FOR(SS#, PROJ#, %TIME)

Simply speaking, each entity type is
converted into a relation, and a
relationship type is converted into a stand-alone
relation or consolidated with another
relation, depending on the cardinality of
the relationship.

If you are familiar with relational
normalization theory, you can prove that
these relations are in Third Normal
Form. As you can see, all the primary
keys of the relations are derived
automatically, and DEPT# in the EMP relation is a
foreign key (i.e., the primary key of
another relation, DEPT).
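
The conversion rule can be sketched in a few lines of Python. This is only
an illustration of the two cases that occur in the example (1:n and n:n),
not a complete set of ER-to-relational rules; the function name
er_to_relations and the dictionary layout are assumptions made for this
sketch.

```python
def er_to_relations(entities, relationships):
    """Return {relation name: (columns, primary-key columns)}."""
    # Each entity type becomes a relation keyed by its own primary key.
    relations = {
        name: ([e["key"]] + list(e["attrs"]), [e["key"]])
        for name, e in entities.items()
    }
    for r in relationships:
        left_key = entities[r["left"]]["key"]
        right_key = entities[r["right"]]["key"]
        if r["card"] == "1:n":
            # Consolidate: the "many" side carries the "one" side's key
            # as a foreign key, plus any attributes of the relationship.
            columns, _ = relations[r["right"]]
            columns.extend([left_key] + list(r["attrs"]))
        elif r["card"] == "n:n":
            # Stand-alone relation keyed by both participants' keys.
            relations[r["name"]] = (
                [left_key, right_key] + list(r["attrs"]),
                [left_key, right_key],
            )
        else:
            raise ValueError(f"unsupported cardinality: {r['card']}")
    return relations


entities = {
    "DEPT": {"key": "DEPT#", "attrs": ["BUDGET"]},
    "EMP": {"key": "SS#", "attrs": ["NAME", "AGE"]},
    "PROJECT": {"key": "PROJ#", "attrs": ["PNAME"]},
}
relationships = [
    {"name": "HAS", "left": "DEPT", "right": "EMP", "card": "1:n", "attrs": []},
    {"name": "WORK_FOR", "left": "EMP", "right": "PROJECT", "card": "n:n",
     "attrs": ["%TIME"]},
]

relations = er_to_relations(entities, relationships)
for name, (columns, key) in relations.items():
    print(f"{name}({', '.join(columns)})  key: {', '.join(key)}")
```

Run on the example, the sketch yields the same four relations listed above:
HAS disappears into EMP as the foreign key DEPT#, while WORK_FOR becomes a
relation of its own.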

• Develop application programs based
on the file and database structures. If
you are using a relational DBMS, you can
now write a Structured Query Language
(SQL) program to express the question,
Which departments have employees
working on project X?

SELECT EMP.DEPT#
FROM EMP, WORK_FOR
WHERE (WORK_FOR.PROJ# = X)
AND (WORK_FOR.SS# = EMP.SS#)
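
To try the query, the following sketch builds the four relations in an
in-memory SQLite database using Python's standard sqlite3 module. Several
things here are assumptions rather than part of the original example: the
column names are respelled (DEPT_NO, SS_NO, PROJ_NO, PCT_TIME) to avoid
quoting # and %, the sample rows are made up, and DISTINCT and ORDER BY are
added so that each department appears once and in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DEPT_NO INTEGER PRIMARY KEY, BUDGET REAL);
    CREATE TABLE EMP (
        SS_NO TEXT PRIMARY KEY, NAME TEXT, AGE INTEGER,
        DEPT_NO INTEGER REFERENCES DEPT(DEPT_NO)      -- foreign key to DEPT
    );
    CREATE TABLE PROJECT (PROJ_NO TEXT PRIMARY KEY, PNAME TEXT);
    CREATE TABLE WORK_FOR (
        SS_NO TEXT REFERENCES EMP(SS_NO),
        PROJ_NO TEXT REFERENCES PROJECT(PROJ_NO),
        PCT_TIME REAL,
        PRIMARY KEY (SS_NO, PROJ_NO)                  -- composite key of the n:n relation
    );
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO DEPT VALUES (?, ?)",
                 [(10, 50000.0), (20, 75000.0), (30, 20000.0)])
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?, ?)",
                 [("111", "Ann", 34, 10), ("222", "Bob", 45, 20),
                  ("333", "Cy", 29, 10), ("444", "Dee", 51, 30)])
conn.executemany("INSERT INTO PROJECT VALUES (?, ?)",
                 [("X", "Payroll"), ("Y", "Billing")])
conn.executemany("INSERT INTO WORK_FOR VALUES (?, ?, ?)",
                 [("111", "X", 50.0), ("333", "X", 100.0), ("222", "X", 25.0),
                  ("444", "Y", 100.0), ("111", "Y", 50.0)])


def departments_on_project(conn, proj_no):
    """Departments that have at least one employee working on proj_no."""
    rows = conn.execute(
        """
        SELECT DISTINCT EMP.DEPT_NO
        FROM EMP, WORK_FOR
        WHERE WORK_FOR.PROJ_NO = ?
          AND WORK_FOR.SS_NO = EMP.SS_NO
        ORDER BY EMP.DEPT_NO
        """,
        (proj_no,),
    ).fetchall()
    return [dept_no for (dept_no,) in rows]


print(departments_on_project(conn, "X"))
```

With the sample rows above, departments 10 and 20 have employees on
project X, while department 30 does not.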

This article shows how to design a
relational database based on the ER
approach. Similarly, you can design file
structures and various other databases,
from microcomputer-based DBMSes
such as dBASE to mainframe-based
DBMSes such as DB2 and IMS, based
on the ER approach.

Figure 1: (a) File format for the program to list employees in each department.
(b) File format for the program to list projects for each employee.

Figure 2: Entity-relationship diagram (ERD) for a database based on figure 1.

Future Trends

You have seen how to design a database 
and an application program based on the 
ER approach. The resultant database is 
sound and avoids such problems as data 
duplication and update anomalies.
Commercial tools are available today to
automate the ER approach.

The ER model can be used not only as 
a design tool but also as the underlying 
model for a DBMS. In the microcom- 
puter and minicomputer range, Zanthe 
(Ottawa) has a product called ZIM. In 
the mainframe area, several computer 
vendors have ER-like DBMSes ready for
marketing. For example, Software AG
has ADABAS/Entire, and Unisys has 
SIM as part of its InfoExec offering. 
On another front, ANSI recently
approved an Information Resource
Dictionary System (IRDS) standard based on the ER
model. In the near future, we'll see a
flood of IRDS products as well as
computer-aided software engineering tools
based on the ER model.
The Structured-Design Approach



Larry L. Constantine 



The computer field likes big
words. Why call something an
instance when instantiation 
works just as well, even if it 
isn't in the dictionary? A software design 
method sounds like the sort of generic- 
brand thinking that anyone could work 
out over a long weekend. But a software 
design methodology sounds like an elab- 
orate and well-thought-out concept, per- 
haps worth attending a seminar on by a 
major software guru, and certainly 
worth the price of a book. However, 
methodology actually means the study of 
methods, and software methodology is an 
ungrammatical use of the word. 

Structured design is both a generic 
term for various systematic approaches 
to designing program structure and also 
a kind of brand name for one particular 
approach. The structured design world is 
a competitive arena. Varying principles 
or specialized diagrams reflect some 
real technical differences. But, more 
than anything else, competing approach- 
es are based on product differentiation, 
personal ego, and the territorial impera- 
tive. These methods, their associated 
tools, and the names of the principals are 
widely recognized. 

The school with the longest legitimate 
claim to the banner of structured design 
is the Constantine-Myers-Stevens-Your- 
don (in alphabetical order, of course) ap- 
proach that I originated in the late 1960s. 
It begins with a data-flow diagram 
(DFD) (often called a "bubble chart") 



showing the transformational structure 
of an information-processing problem; 
then it derives a model of the modular 
structure of software that will solve that 
problem. 

Models 

Much is made of design methods, but 
structured design is really powered by a 
troika consisting of models, methods, 
and measures (see figure 1). The models 
make it possible to picture and play with 
the modular structure of software
systems without actually having to program
them first. 

System-structure modeling, now ac- 
cepted as essential to software engineer- 
ing, was a novel and suspect notion when 
I first introduced it. The models used in 
structured design are graphical tools, an- 
notated to represent the structure of 
problems and programs. For example, 
the structure chart, an elaboration of the 
older hierarchy chart, shows all the mod- 
ules in a system and their essential inter- 
relationships in one compact model. It 
allows you to see the "shape of things to
come" and to explore alternative ways to 
organize software. 

Measures 

Structured design, unlike some other 
structured techniques and software engi- 
neering "methodologies," is grounded in 



a body of underlying theory about what
makes programs complicated to build
right in the first place and difficult to
change in the second. The practical
embodiment of this theory takes the form of
two measures— coupling and cohesion— 
that index the relative complexity or dif- 
ficulty of various designs. 

Simple programs are, simply put, 
made out of little pieces, each of which is 
easily thought of as a unit or a whole that 
is mostly independent of other pieces. 
Module cohesion is a measure of module 
"wholeness," and coupling measures in- 
terdependence. In other words, good de- 
signs that are easy to build and change 
are based on a bunch of modules, each of 
which is "cohesive," or well-glued to- 
gether, and only loosely "coupled" to 
other modules. 

Designing in this way, you can pro- 
gram very large systems by writing only 
small, separate pieces of code. This 
theory is proving to be the most durable 
element of the troika. A decade of re- 
search has demonstrated the soundness 
of the basic assumptions about coupling 
and cohesion and has refined our under- 
standing of how they affect program- 
ming and maintenance costs, fault rates, 
and ease of modification. Quantitative 
metrics based on the theory now make it 
possible to automate design evaluation 
and even parts of the design process. 
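
To give a feel for what a quantitative measure of this kind might look
like, here is a deliberately crude sketch. It is not one of the published
metrics; the function name, the input format, and both formulas are
assumptions made for illustration. Coupling is taken as the number of other
modules a module is connected to, and cohesion as the fraction of a
module's references that stay inside the module.

```python
from collections import defaultdict


def module_scores(owner, references):
    """Score each module from element-level references.

    owner:      maps each program element to the module that contains it
    references: iterable of (from_element, to_element) pairs
    """
    internal = defaultdict(int)    # references that stay inside a module
    external = defaultdict(int)    # references that cross a module boundary
    neighbors = defaultdict(set)   # other modules a module is connected to
    for src, dst in references:
        m_src, m_dst = owner[src], owner[dst]
        if m_src == m_dst:
            internal[m_src] += 1
        else:
            external[m_src] += 1
            external[m_dst] += 1
            neighbors[m_src].add(m_dst)
            neighbors[m_dst].add(m_src)
    scores = {}
    for module in sorted(set(owner.values())):
        total = internal[module] + external[module]
        scores[module] = {
            "coupling": len(neighbors[module]),                        # lower is better
            "cohesion": internal[module] / total if total else 0.0,   # higher is better
        }
    return scores


# Hypothetical three-module payroll design.
owner = {
    "read_record": "input", "parse_record": "input",
    "compute_pay": "payroll", "compute_tax": "payroll",
    "print_check": "output",
}
references = [
    ("read_record", "parse_record"),
    ("compute_pay", "compute_tax"),
    ("compute_pay", "parse_record"),
    ("print_check", "compute_pay"),
]

scores = module_scores(owner, references)
for module, s in scores.items():
    print(f"{module}: coupling={s['coupling']} cohesion={s['cohesion']:.2f}")
```

Even a proxy this simple lets you compare alternative designs: moving an
element between modules changes both numbers, and a design with lower
coupling and higher cohesion is the one the theory predicts will be easier
to build and change.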

Object-oriented methods are emerging
as major factors in software engineering,
but even with these powerful new
techniques, the cohesion of actual modules
that implement object classes and their
coupling with other modules turn out to
be important for building simple systems
with truly reusable components. As a
consequence, the original measures of
coupling and cohesion have now been
extended and adapted to evaluate the
quality of so-called object class modules and
abstract data types.

Methods

Methods is the weak third member of the
structured design team, but the one that
gets much of the attention. The methods
of structured design include a loose
collection of rules and some moderately
systematic strategies for the step-by-step
design of software. These strategies are
based on specific kinds of software
organization that have proved effective in
practice. Among the most durable are the
balanced system structure known as
transform-centered organization and the
event-oriented organization called
transaction-centered. Distinct design methods
are aimed at producing each variation on
system organization.

You might outline an overall
structured design method as follows:

1. Develop a nonprocedural
(method-independent) statement
of the system requirements, usually
centered around a DFD.

2. Based on the structure of the
problem, choose an appropriate
software organizational model or
combination.

3. Guided by the data flow and the
chosen organizational model,
decompose overall functions into
subfunctions and compose
primitive functions into higher-level
functions until the complete
requirements are satisfied.

4. Using various design rules and
measures of coupling and
cohesion, refine the design for
increased modularity,
extensibility, and likely reusability
of modules.

5. Complete detailed designs as
necessary for all modules.

The problem, of course, is that
actually carrying out the modeling,
evaluation, and refinement involved in
structured design takes discipline and time,
uses a lot of paper, and can wear out the
erasers on every pencil in the office.
Unaided by computer tools, many
developers who use structured design have used
it informally and unsystematically, while
others have used its concepts to shape
their thinking without ever applying the
formal models, measures, and methods.

Automating Methods

Computer-aided software engineering is
not a replacement or substitute for
systematic methods; it's the key that
unlocks their full potential. With CASE
tools, real structured design on problems
of interesting size becomes truly
practicable for the first time. The diagram
editors of CASE systems assist in developing
and refining complex graphical models.
CASE tools with built-in knowledge of
the models' meaning can check them for
consistency and conformance with the
established rules of structured design.

With intelligence about good designs
incorporated into CASE tools, the
computer can evaluate the quality of actual
structured designs or even sketch out a
rough initial design. At present, the most
advanced CASE workbenches provide
powerful support for existing software
engineering methods; in the future, new
structured methods are likely to be
developed based on the use of
quasi-intelligent CASE tools.

CASE systems can also reinforce or
impose standards for software
architectures and the diagrams that document
them, but CASE vendors have yet to take
on much responsibility for
standardization. Most of the available tools and
workbenches are methodology-independent,
meaning you can use them to do
any brand of design, structured or
unstructured, that you may choose.

This may be an acceptable level of
flexibility, given the state of the science
in "computer science," but some of these
tools have arbitrarily departed from what
few standards and conventions do exist.
It doesn't make sense for a system to use
its own peculiar icon for an included
module any more than to allow a CAD/CAM
system for electrical engineers to
use just any old shape to represent a
transistor.

If we use computer-aided software
engineering tools, and we call ourselves
software engineers, perhaps it's time we
acted more like engineers.

BIBLIOGRAPHY
Stevens, W. P., G. J. Myers, and L. L.
Constantine. "Structured Design." IBM
Systems Journal, vol. 13, no. 2, 1974.
Yourdon, Edward, and Larry L.
Constantine. Structured Design. Englewood
Cliffs, NJ: Yourdon Press/Prentice-Hall,
1979.

Larry L. Constantine is a software
methodologist in Acton, Massachusetts. He is
a well-known author and speaker in the
field of structured design and analysis.
He can be reached on BIX c/o "editors."

Figure 1: The troika of structured design: models, methods, and measures.

































APRIL 1989 • BYTE 233



